klotz: production engineering*

Production Engineering focuses on the design, implementation, and management of systems and processes to ensure the efficient and reliable delivery of software and services in a production environment. It involves various aspects such as deploying, monitoring, and maintaining applications, managing infrastructure, and handling data pipelines. Production Engineering KPIs include Availability and Cost.


  1. A configuration as code language with rich validation and tooling.
  2. Platform Engineering Labs has released formae, an open-source infrastructure-as-code platform designed to address limitations in existing tools. It focuses on automatic discovery, codification of existing infrastructure, and a reconcile/patch workflow. It uses Pkl instead of HCL and aims to reduce drift and complexity.
  3. This article details how Nubank built its own in-house logging platform to address issues of cost, scalability, and control over their logging infrastructure. Initially reliant on a vendor solution, they found costs rising unpredictably and experienced limitations in observability and data retention.

    To solve this, Nubank divided the project into two major steps: **The Observability Stream** (ingestion and processing) and the **Query & Log Platform** (storage and querying).

    * **Observability Stream:** Fluent Bit for data collection, a Data Buffer Service for micro-batching, and an in-house Filter & Process Service.
    * **Query & Log Platform:** Trino as the query engine, AWS S3 for storage, and Parquet for data format.

    The new platform currently ingests 1 trillion logs daily, stores 45 PB of searchable data with 45-day retention, and handles almost 15,000 queries daily. Nubank reports that the platform costs 50% less than comparable market solutions while giving the company greater control, scalability, and the ability to customize features. The project underscored Nubank's value of challenging the status quo and combining open-source and in-house development.
  4. An effort to create a fully functional Kubernetes cluster with 1 million active nodes. The article details the challenges and solutions for scaling Kubernetes to this size, covering networking, state management (etcd), and the scheduler.
  5. This paper provides a theoretical analysis of Transformers' limitations for time series forecasting through the lens of In-Context Learning (ICL) theory, demonstrating that even powerful Transformers often fail to outperform simpler baselines such as linear models. The study focuses on Linear Self-Attention (LSA) models and shows that they cannot achieve lower expected MSE than classical linear models for in-context forecasting, and that predictions collapse to the mean exponentially fast under Chain-of-Thought inference.
  6. This article explores how prompt engineering can be used to improve time-series analysis with Large Language Models (LLMs), covering core strategies, preprocessing, anomaly detection, and feature engineering. It provides practical prompts and examples for various tasks.
  7. Dozzle is a lightweight, self-hosted solution that provides a real-time look into your container logs, offering an intuitive UI, real-time logging, intelligent search, and support for multiple use cases like home labs and local development.
  8. TraceRoot.AI is an AI-native observability platform that helps developers fix production bugs faster by analyzing structured logs and traces. It offers SDK integration, AI agents for root cause analysis, and a platform for comprehensive visualizations.
  9. TraceRoot accelerates the debugging process with AI-powered insights. It integrates seamlessly into your development workflow, providing real-time trace and log analysis, code context understanding, and intelligent assistance. It offers both a cloud and self-hosted version, with SDKs available for Python and JavaScript/TypeScript.
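
The micro-batching step mentioned in the Nubank entry (item 3) can be illustrated with a minimal sketch. This is not Nubank's Data Buffer Service: the `MicroBatcher` class, its parameters, and the thresholds are hypothetical, chosen only to show the general idea of buffering records and flushing them by size or age.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MicroBatcher:
    """Hypothetical helper: buffers records and flushes them in small batches."""

    flush: Callable[[List[str]], None]  # sink that receives each batch
    max_records: int = 3                # flush once this many records are buffered
    max_age_seconds: float = 5.0        # or once the oldest record is this old
    _buffer: List[str] = field(default_factory=list)
    _oldest: float = 0.0

    def add(self, record: str, now: float) -> None:
        # Remember when the current batch started.
        if not self._buffer:
            self._oldest = now
        self._buffer.append(record)
        too_big = len(self._buffer) >= self.max_records
        too_old = now - self._oldest >= self.max_age_seconds
        if too_big or too_old:
            self.drain()

    def drain(self) -> None:
        # Hand the buffered records to the sink and start a fresh batch.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.flush(batch)


# Usage: timestamps are passed in explicitly so the example is deterministic.
batches: List[List[str]] = []
batcher = MicroBatcher(flush=batches.append)
for i, line in enumerate(["a", "b", "c", "d"]):
    batcher.add(line, now=float(i))
batcher.drain()  # flush whatever is left at shutdown
```

In a real pipeline the sink would write to a downstream processor or object store rather than append to a list, and the age check would usually also run on a timer so that a quiet stream still flushes.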
